Core protocol v3.0 - conceptual model #17
Oops, I meant to create this PR in my own fork, but got a bit confused while using the GitHub online editor and created it within zarr-developers instead. I'll leave this PR as-is for now to avoid confusion but will try to use my own fork for other PRs.
Node names
----------

TODO define constraints on node names
N.B., I don't intend to address this TODO or any below in this PR, just adding them as placeholders for a possible structure for other sections.
OK, this is now a reasonably complete first pass. Comments welcome.

In the interests of having a straw man and a rough framework for further work, if there are no objections I'd like to merge this PR, which is going into the core-protocol-v3.0-dev branch. I would certainly expect us to revisit some or all of this, so this is not setting anything in stone.

Not to slow things down here, but do we want to consider soft/hard links? Mentioning this as that might move us away from trees and towards graphs.
Good question. I was actually thinking that soft/hard links would be something that could be tackled in a protocol extension, but very happy to discuss and revisit. In general I'm thinking there are a number of features that could be handled as protocol extensions, which would give us a mechanism for keeping the core protocol as minimal as possible. I'll try and flesh that out soon.

I'm going to merge this to provide something to build on, but very happy to discuss and revisit any aspect of this as we move forward.

I would not include soft and hard links here as their behavior is specific to various backends. I also find it attractive to think about links on the filesystem as a vehicle to implement non-redundant versioned data, but that is a different issue...
the transformation (encode), the other to reverse the transformation
(decode).

Each node in a hierarchy is represented by a *metadata document*,
I suggest adding: An empty metadata document is equivalent to no metadata document and means that there is no metadata associated with the node. This is only possible for trivial group nodes.
Interesting, might need to unpack and discuss a little.
FWIW in zarr v2 the presence of a metadata document indicates the existence of a node. E.g., if the key /foo/bar/.zgroup exists in the store then that implies a group exists in the hierarchy at logical path /foo/bar. Similarly if a key exists in the store at /foo/bar/baz/.zarray then that implies an array exists in the hierarchy at logical path /foo/bar/baz. So you can actually construct the hierarchy just by knowing which keys are present in the store, without even retrieving or reading the metadata documents. That sounds slightly different from what you're suggesting here?
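To make the v2 behaviour above concrete, here is a rough Python sketch of rebuilding the hierarchy from key names alone. The `infer_nodes` helper and the example key list are hypothetical, made up for illustration; this is not code from any Zarr library, and it assumes keys are plain `/`-separated strings as in the examples above.

```python
def infer_nodes(keys):
    """Map logical path -> node type using only the key names.

    Hypothetical helper: no metadata document is fetched or parsed.
    """
    markers = {".zgroup": "group", ".zarray": "array"}
    nodes = {}
    for key in keys:
        # Split "/foo/bar/.zgroup" into parent "/foo/bar" and leaf ".zgroup".
        parent, _, leaf = key.rpartition("/")
        if leaf in markers:
            # An empty parent means the marker sits at the root.
            nodes[parent or "/"] = markers[leaf]
    return nodes


# Illustrative keys only; "0.0" stands in for a chunk key and is ignored.
store_keys = [
    "/foo/bar/.zgroup",
    "/foo/bar/baz/.zarray",
    "/foo/bar/baz/0.0",
]
hierarchy = infer_nodes(store_keys)
```

With these keys, `hierarchy` ends up with a group at `/foo/bar` and an array at `/foo/bar/baz`, without opening either metadata document.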
joshmoore left a comment
One minor comment from a first reading.
but array nodes may not.

Each node in a hierarchy has a *name* which is a string of ASCII
characters with some additional constraints. Two sibling nodes cannot
Are there any limitations on the characters in that string?
This PR has some initial drafting of text describing a conceptual model for Zarr, which hopefully helps to provide a foundation for everything else to go into the spec. Work towards #16.